{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Scatterplots in pydeck: A case study using Beijing subway stops\n",
    "\n",
    "Below we'll plot the location of Beijing subway stops over time, half of which have been built since 2015. Locations for subway stops come from [Wikipedia](https://en.wikipedia.org/wiki/List_of_Beijing_Subway_stations) and [OpenStreetMap](https://wiki.openstreetmap.org/wiki/Beijing_Subway). This is not a rigorous study, so some subway stops may be missing.\n",
    "\n",
    "## Contents\n",
    "\n",
    "- [Getting the data](#Getting-the-data)\n",
    "- [Data cleaning](#Data-cleaning)\n",
    "- [Using pydeck to automatically create a viewport](#Automatically-generate-a-viewport)\n",
    "- [Plotting the data](#Plotting-the-data)\n",
    "- [Plotting the data over time](#Playing-the-data-forward-in-time)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Getting the data\n",
    "\n",
    "First, we can use the [Pandas library](https://pandas.pydata.org/) to download our data. You're likely already familiar with it–Pandas is a very popular library in Python for filtering, aggregating, and joining data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import pydeck as pdk\n",
    "\n",
    "# First, let's use Pandas to download our data\n",
    "URL = 'https://raw.githubusercontent.com/ajduberstein/data_sets/master/beijing_subway_station.csv'\n",
    "df = pd.read_csv(URL)\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data cleaning\n",
    "\n",
    "Next, we'll have to engage in some necessary data housekeeping. The CSV encodes the `[R, G, B, A]` color values a `str`, and `literal_eval` lets us convert that string a list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from ast import literal_eval\n",
    "# We have to re-code position to be one field in a list, so we'll do that here:\n",
    "# The CSV encodes the [R, G, B, A] color values listed in it as a string\n",
    "df['color'] = df.apply(lambda x: literal_eval(x['color']), axis=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Automatically generate a viewport\n",
    "\n",
    "pydeck features some utilities for visualizing data, like an **automatic zoom using `data_utils.compute_view`** for 2D data sets.\n",
    "\n",
    "We'll render the viewport, as well, just to verify that the visualization looks sensible."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Use pydeck's data_utils module to fit a viewport to the central 90% of the data\n",
    "viewport = pdk.data_utils.compute_view(points=df[['lng', 'lat']], view_proportion=0.9)\n",
    "auto_zoom_map = pdk.Deck(layers=[], initial_view_state=viewport)\n",
    "auto_zoom_map.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Sure enough, we're centered to Beijing.\n",
    "\n",
    "# Plotting the data\n",
    "\n",
    "We'll render the data and use some Jupyter notebook functionality to provide a header with a year.\n",
    "\n",
    "It's worth spending some time on each line, if you haven't seen the Layer object yet:\n",
    "\n",
    "```python\n",
    "scatterplot = Layer(\n",
    "    'ScatterplotLayer',\n",
    "    df,\n",
    "    get_radius=500,\n",
    "    get_fill_color='color',\n",
    "    get_position='position')\n",
    "```\n",
    "\n",
    "\n",
    "\n",
    "**We can specify the layer type as the first argument, the data as the second, and the layer arguments as keywords.** **[ScatterplotLayer](https://github.com/visgl/deck.gl/blob/master/docs/layers/scatterplot-layer.md)** is one of a list of layers available in the deck.gl core library. We'll also provide a header to list the year using some built-in Jupyter notebook tools.\n",
    "\n",
    "For a list of other layers, see the [deck.gl documentation](https://github.com/visgl/deck.gl/tree/master/docs/layers#deckgl-layer-catalog-overview). Remember that deck.gl is a JavaScript library and not a Python one, so the documentation may differ for some kinds of terminology and functionality (e.g., pydeck doesn't support passing functions as arguments but this is a common occurrence within deck.gl)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "from IPython.display import display\n",
    "import ipywidgets\n",
    "\n",
    "year = 2019\n",
    "\n",
    "scatterplot = pdk.Layer(\n",
    "    'ScatterplotLayer',\n",
    "    df,\n",
    "    get_position=['lng', 'lat'],\n",
    "    get_radius=500,\n",
    "    get_fill_color='color')\n",
    "r = pdk.Deck(scatterplot, initial_view_state=viewport)\n",
    "\n",
    "# Create an HTML header to display the year\n",
    "display_el = ipywidgets.HTML('<h1>{}</h1>'.format(year))\n",
    "display(display_el)\n",
    "# Show the current visualization\n",
    "r.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Playing the data forward in time\n",
    "\n",
    "Finally, we can loop through the data and see the dramatic development in Beijing since 1971, as demonstrated by subway stop opening dates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "for y in range(1971, 2020):\n",
    "    scatterplot.data = df[df['opening_date'] <= str(y)]\n",
    "    year = y\n",
    "    # Reset the header to display the year\n",
    "    display_el.value = '<h1>{}</h1>'.format(year)\n",
    "    r.update()\n",
    "    time.sleep(0.2)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
